1
How Practical Big Data Management
Can Drive Value in Healthcare
Big Data Symposium Session BG5, Monday February 11, 2019, 2:30-3:30PM
Jason Burke
System VP and Chief Analytics Officer
UNC Health Care
2
I have no real or apparent conflicts of interest to report.
Conflict of Interest
3
Define the evolving role of data management in the field of health
care “big data”
Discuss some of the new competencies for effectively leveraging
big data at scale
Share some examples of how this is being applied at a large
health care system
Agenda
4
1. Identify data management and data governance best practices
that are essential to a Big Data ecosystem
2. Explore how governance applies to new and existing Big Data
programs
3. Assess how an effective Big Data strategy can address
challenges and enhance data sharing efforts while being mindful
of big data ethics
4. Demonstrate real-world implementations of Big Data
management best practices in healthcare and how they support
value delivery and strategic outcomes
Learning Objectives
5
Here’s Your Report!
Starting with the end in mind…
6
The Changing Role and Scope of
“Data Management”
1990s 2000s 2010s 2020s
Managing files
Managing operational databases
Managing data warehouses
Managing online data
Massive data
Unmanaged
The exponential growth of data
drives an explosive growth in data
management issues
7
Valuation Area | 2000 | 2020
What is the source of the data? | We are | ?
What do we know about this data and what it means? | It matches our operations | ?
What’s in the data; how representative is it? | It matches our operations | ?
How was the data obtained and managed? | Through our systems | ?
Who else is using this data? | Just us | ?
How consistent is the data with other data? | There is no other data | Inconsistent
How good was the process used to create this data? | It matches our operations | ?
How current is the data? | It matches our operations | ?
How much is it changing? | It matches our operations | A lot
Human Valuation and Big Data
8
We have to know and manage a lot ABOUT our data,
not just the data itself
Implications of Today’s Big Data
What is the source of the data?
What do we know about this data and what it means?
What’s in the data; how representative is it?
How was the data obtained and managed?
Who else is using this data?
How consistent is the data with other data?
How good was the process used to create this data?
How current is the data?
How much is it changing?
Lineage / Pedigree
Business Context
Quality
Reasonableness
Consistency
Pervasiveness
Socialization
Controls
Currency
Volatility
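One way to make "knowing a lot ABOUT our data" concrete is to keep a small descriptive record per dataset, with one slot for each label above. The following is a minimal sketch under stated assumptions: the class name `DatasetProfile`, its field names, the free-text representation, and the sample values are all hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DatasetProfile:
    """Facts ABOUT a dataset, one optional field per label on the slide.

    Hypothetical structure for illustration only.
    """
    name: str
    lineage: Optional[str] = None           # Lineage / Pedigree
    business_context: Optional[str] = None  # Business Context
    quality: Optional[str] = None           # Quality
    reasonableness: Optional[str] = None    # Reasonableness
    consistency: Optional[str] = None       # Consistency
    pervasiveness: Optional[str] = None     # Pervasiveness
    socialization: Optional[str] = None     # Socialization
    controls: Optional[str] = None          # Controls
    currency: Optional[str] = None          # Currency
    volatility: Optional[str] = None        # Volatility

    def unanswered(self) -> list[str]:
        """Return the dimensions that nobody has documented yet."""
        return [
            f.name
            for f in fields(self)
            if f.name != "name" and getattr(self, f.name) is None
        ]


# Hypothetical dataset with only two of the ten dimensions documented.
profile = DatasetProfile(
    name="hypothetical_claims_feed",
    lineage="External extract (hypothetical)",
    currency="Refreshed monthly (hypothetical)",
)
print(profile.unanswered())
```

Listing the undocumented dimensions gives a simple, visible measure of how much is still unknown about a dataset before it is used for decisions.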
9
Image via Flickr user horiavarlan
Why is this so different?
[Figure: SCOPE (individual, institutional, multi-institutional, global) against CONSTRAINTS (controlled, semi-controlled, real world); labeled areas: CLINICAL RESEARCH, QUALITY IMPROVEMENT, PRACTICE-BASED EVIDENCE; additional label: cohorts]
10
Dismantling the Hype
The Buzz
A technology problem
You can fix it with text mining, machine learning, AI, Hadoop, or another buzzword
Standards are the fix
Sticking to one vendor is a fix
Outsourcing helps
The Reality
A people / process problem
Garbage in, garbage out… but we can improve it even if we can’t fix it
Standards help a lot
Sticking to one vendor is impossible, but does help
No vendor knows your business better than you
11
Harnessing Big Data is an Organizational Competency
Data → Information → Knowledge → Insight → Action → Value
Raw Data → Context → Meaning → Purpose → Decision → Positive Change
People & Process Investments
12
Designing for Reusability
Image courtesy of http://www.flickr.com/photos/wonderlane
13
PROJECTS
Stakeholders are project area experts
Effort is focused on predefined questions
Work is relevant to project team
Timeline is project driven
Data definitions are project specific
Data structured for single use
Little-to-no analytical code reuse
Release available to project stakeholders
PRODUCTS
Stakeholders are functional experts
Questions are not predefined
Work must be relevant to multiple customers
Timeline is engineering driven
Data definitions are enterprise-wide
Data is structured for broad re-use
Analytical models are built for multiple projects
Release available to entire enterprise
Reusability in Analytics: Products!
14
Investing in Competencies
Data Quality Management
Analytics Competency Development & Staffing
Data Policies and Standards
Master Data Management (MDM)
Metadata Management
Community Engagement
Knowledge Management
Governance & Decision Making
Asset Provisioning, Management & Certification
Engineering Management
Architecture Design and Management
Data Strategy Formalism
Data Model Engineering & Management (Domains)
Analytical Model Engineering & Management
Data Operations Management and Controls
Data Roles & Stewardship
Consulting and Guidance
ENGINEERING
OPERATIONS
Lifecycle and Quality Management
DATA GOVERNANCE
BUSINESS OPERATIONS
15
Like many academic medical systems, UNC HCS has a diversified,
empowered culture
Organizational units with deep subject matter experts (SMEs)
Spirit of research, innovation, and entrepreneurship
Any journey with analytics must respect those cultural norms
Fully centralized: “ivory tower”, bottlenecks, loss of institutional context and SMEs
Fully federated: no economies of scale, impossible to establish a single source of truth
We opted to pursue a hybrid model
Centralize building reusable assets
Federate the use and extension of those assets
Bring federated SMEs into all build-related work
Help federated users be more effective with data / analytics
Business & Clinical SMEs
Technology SMEs
Analytics SMEs
Centralized vs. Federated Capability Development
Designed Into
Governance
Solution Development
System Enhancements
ORBITs
16
Data Governance
17
Data Governance Program Pillars
(Knowledgent, Mikol)
Metadata Mgmt
o Application inventory
o System interfaces inventory
o Business glossary
o Application data dictionaries
o Core reports inventory
o Data release inventory
Data Quality Mgmt
o Data Profiling
o Transparency of results
o Quality thresholds
o Prioritized remediation efforts
Master Data Mgmt
o Source of truth for key shared data
o Common terms and groupings for integrated analytics
Data Policy
o Assign decision rights and accountabilities
o Permissions
o Standardize data movement and access approval methods
o Further define regulation
Data Domains
o Group data into unique domains
o Nominate Information Owners
o Data Steward selection and engagement
Analytics Competency
o Tool inventory
o Remove duplication of efforts and solutions
o Clearly prioritize analytics initiatives
o Drive progress toward self-service analytics
DG Maturity
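The data quality items on this slide (data profiling, quality thresholds, prioritized remediation efforts) can be illustrated with a small example. This is a sketch under stated assumptions, not the program's actual tooling: completeness is used as the only quality measure, and the function names, column names, sample rows, and threshold values are all hypothetical.

```python
from typing import Any


def profile_completeness(
    rows: list[dict[str, Any]], columns: list[str]
) -> dict[str, float]:
    """Return, per column, the share of rows where the value is populated."""
    total = len(rows)
    scores = {}
    for col in columns:
        # A value counts as populated unless it is missing, None, or empty.
        filled = sum(1 for row in rows if row.get(col) not in (None, ""))
        scores[col] = filled / total if total else 0.0
    return scores


def remediation_queue(
    scores: dict[str, float], thresholds: dict[str, float]
) -> list[str]:
    """Return columns that miss their threshold, largest shortfall first."""
    gaps = {
        col: thresholds[col] - score
        for col, score in scores.items()
        if col in thresholds and score < thresholds[col]
    }
    return sorted(gaps, key=gaps.get, reverse=True)


# Hypothetical patient records and hypothetical quality thresholds.
rows = [
    {"mrn": "A1", "dob": "1980-01-01", "zip": "27514"},
    {"mrn": "A2", "dob": "1975-06-30", "zip": ""},
    {"mrn": "A3", "dob": None, "zip": "27599"},
    {"mrn": "A4", "dob": "1990-12-12", "zip": None},
]
thresholds = {"mrn": 1.0, "dob": 0.9, "zip": 0.8}

scores = profile_completeness(rows, ["mrn", "dob", "zip"])
queue = remediation_queue(scores, thresholds)
print(scores)
print(queue)
```

Publishing the scores supports transparency of results, and ordering the queue by shortfall is one simple way to prioritize remediation; a real program would likely weigh business impact as well.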
18
Asset Certification
Certification is a review process of key elements and tools used for decision making across the
enterprise. Certification provides:
o Reliability
o Traceability
o Standardization
o Documentation
Certification Levels
o Gold
o Silver
o Bronze
Certification Drivers
o Initiator
o Stakeholders
o Data Governance
Asset Provisioning, Management & Certification
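The slide names three certification levels and four things certification provides, but it does not state the criteria for each level. The sketch below invents a rule purely for illustration: it assumes each of the four properties is reviewed as a yes/no check and that the level depends on how many checks pass. The class names, the `certify` function, and the counting rule are hypothetical and are not the criteria used by the presenter's organization.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CertificationLevel(Enum):
    GOLD = "Gold"
    SILVER = "Silver"
    BRONZE = "Bronze"


@dataclass
class AssetReview:
    """Outcome of reviewing one analytic asset (hypothetical structure)."""
    asset_name: str
    reliable: bool
    traceable: bool
    standardized: bool
    documented: bool


def certify(review: AssetReview) -> Optional[CertificationLevel]:
    """Map a review to a level using an ASSUMED rule: count passed checks."""
    passed = sum(
        [review.reliable, review.traceable,
         review.standardized, review.documented]
    )
    if passed == 4:
        return CertificationLevel.GOLD
    if passed == 3:
        return CertificationLevel.SILVER
    if passed == 2:
        return CertificationLevel.BRONZE
    return None  # not certified under this assumed rule


# Hypothetical asset that passes three of the four checks.
review = AssetReview(
    asset_name="hypothetical_readmissions_report",
    reliable=True,
    traceable=True,
    standardized=True,
    documented=False,
)
level = certify(review)
print(level)
```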
19
Community Knowledge & Culture
Community Engagement
Knowledge Management
20
Data availability velocity > policy development velocity
Obtaining multidisciplinary perspectives on a big data opportunity up
front is critical to ensuring the right questions are being asked
Questions to ask as soon as possible:
Who are the stakeholders for this work?
Who has granted consent for this work, when, and why?
What are potentially negative outcomes of this work, and who would
be impacted (e.g., sponsors vs. owners vs. stakeholders)?
If we told patients and/or physicians we were doing this, what would
they think?
Where is the line between “improvement” and “research”?
What controls can be used to mitigate potentially negative impacts
of this?
A Word on Ethics
21
Common views of health care system utilization through shared semantic representation
Requires data domains through a more sophisticated data strategy linked to data governance
Example: Utilization Modeling
22
Example: Patient Throughput Optimization
Bed Assignment
Physical Capacity
Order Triage
Throughput Optimization
Using discrete event simulation to drive operational efficiencies
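The slide names discrete event simulation without describing the model. The sketch below is a generic, textbook-style instance of the technique and not the model used at UNC Health Care: patients arrive at random, wait first-come-first-served for one of a fixed number of beds, and free the bed after a random length of stay. The function name, parameters, exponential distributions, and all numeric values are assumptions chosen for illustration.

```python
import heapq
import random
from collections import deque


def simulate_beds(num_beds: int, arrival_rate: float, mean_stay: float,
                  horizon: float, seed: int = 0) -> float:
    """Return the average wait for a bed among patients who were placed.

    Events are (time, sequence, kind) tuples processed in time order;
    the sequence number only breaks ties between equal times.
    Time units are arbitrary (hours are assumed in the example below).
    """
    rng = random.Random(seed)
    events = [(rng.expovariate(arrival_rate), 0, "arrival")]
    seq = 1
    free_beds = num_beds
    waiting = deque()  # arrival times of patients still waiting for a bed
    waits = []         # waiting time of each patient who received a bed

    while events:
        now, _, kind = heapq.heappop(events)
        if now > horizon:
            break
        if kind == "arrival":
            waiting.append(now)
            # Schedule the next arrival.
            next_arrival = now + rng.expovariate(arrival_rate)
            heapq.heappush(events, (next_arrival, seq, "arrival"))
            seq += 1
        else:  # discharge frees a bed
            free_beds += 1
        # Assign free beds to waiting patients, first come, first served.
        while free_beds > 0 and waiting:
            arrived = waiting.popleft()
            free_beds -= 1
            waits.append(now - arrived)
            discharge = now + rng.expovariate(1.0 / mean_stay)
            heapq.heappush(events, (discharge, seq, "discharge"))
            seq += 1

    return sum(waits) / len(waits) if waits else 0.0


# Hypothetical scenarios: 2 arrivals per hour, 5-hour mean stay, 200 hours.
for beds in (1, 10, 1000):
    avg_wait = simulate_beds(beds, arrival_rate=2.0, mean_stay=5.0,
                             horizon=200.0)
    print(beds, round(avg_wait, 2))
```

Rerunning the same model with different bed counts is the kind of what-if comparison that makes simulation useful for capacity questions, although an operational model would need real arrival and length-of-stay data.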
23
Example: Emergency Operations
Dynamic ED Dashboard
ED Surge Model
ED Throughput Optimization
Models can be interpreted consistently
despite different methods and focus
24
Example: Care Variation
Mapping clinical context to big data
Individual diseases have disease-specific models
Performance is defined against system-level standards
Analytics are used to normalize comparisons
25
Do we care about “big data” or “big insights”?
Image courtesy of http://www.flickr.com/photos/strangrthancandy
26
DATA Issues
Storage
Structure
Timeliness
Semantics & Language
Validity
Reliability
Triage
Pedigree
INSIGHT Issues
Innovation
Health Outcomes
Profitability
Productivity
Translational Science
Customer Intimacy
Risk
Value
Image courtesy of http://www.flickr.com/photos/jdhancock
27
Summary
1. Deriving value from big data is about a lot more than traditional
“data management”
What we know about our data
What we know from our data
2. Data governance is one of the key domains required to effectively
operationalize big data
3. Routine value creation from big data is dependent on growing and
transitioning enterprise capabilities
Reusable designs and assets
New process development
Business and clinical engagement
4. Governance programs need to be ever mindful of ethics
considerations
New data use is sometimes unplanned data use
28
Jason Burke
System VP and Chief Analytics Officer
Email: Jason.Burke@unchealth.unc.edu
Twitter: @jaburke
LinkedIn: jasonburke
Please complete online session evaluation!
Questions